A. Abrardo, "Design of optimal convolutional codes for joint decoding of correlated sources in wireless sensor networks" 



Design of optimal convolutional codes for 
joint decoding of correlated sources in 
wireless sensor networks 

Andrea Abrardo 

Department of Information Engineering - University of Siena 
Via Roma, 56 - 53100 Siena, ITALY 

Contact Author : Andrea Abrardo 

e-mail: abrardo@mg.unisi.it, Tel.: +39 0577 234624, Fax: +39 0577 233602 

Abstract 

We consider a wireless sensor network scenario where two nodes detect correlated sources and
deliver them to a central collector via a wireless link. Unlike the Slepian-Wolf approach
to distributed source coding, in the proposed scenario the sensing nodes do not perform any
pre-compression of the sensed data. Original data are instead independently encoded by means of
low-complexity convolutional codes. The decoder performs joint decoding with the aim of exploiting
the inherent correlation between the transmitted sources. Complexity at the decoder is kept low
thanks to the use of an iterative joint decoding scheme, where the output of each decoder is fed to the
other decoder's input as a-priori information. For such a scheme, we derive a novel analytical
framework for evaluating an upper bound of the joint-detection packet error probability and for deriving the
optimum coding scheme. Experimental results confirm the validity of the analytical framework, and
show that recursive codes allow a noticeable performance gain with respect to non-recursive coding
schemes. Moreover, the proposed recursive coding scheme approaches the ideal Slepian-Wolf
scheme performance in the AWGN channel, and clearly outperforms it over fading channels on account
of the diversity gain due to the correlation of information.

Index Terms - Convolutional codes, correlated sources, joint decoding, wireless sensor networks. 

I. Introduction 

Wireless sensor networks have recently received a lot of attention in the research
literature [1]. The efficient transmission of correlated signals observed at different nodes
to one or more collectors is one of the main challenges in such networks. In the case
of one collector node, this problem is often referred to as the reach-back channel in the
literature [2], [3], [1]. In its simplest form, the problem can be summarized as follows:
two independent nodes have to transmit correlated sensed data to a collector node 
by using the minimum energy, i.e., by exploiting in some way the implicit correlation 
among data. In an attempt to exploit such correlation, many works have recently 
focussed on the design of coding schemes that approach the Slepian-Wolf fundamental 
limit on the achievable compression rates [5], [6], [7], [8]. However, approaching the 
Slepian-Wolf compression limit requires in general a huge implementation complexity 
at the transmitter (in terms of number of operations and memory requirements) that 
in many cases is not compatible with the needs of deploying very light-weight, low-cost,
and low-power sensor nodes. Alternative approaches to distributed source coding
are represented by cooperative source-channel coding schemes and joint source-channel 
coding. 

In a cooperative system, each user is assigned one or more partners. The partners 
overhear each other's transmitted signals, process these signals, and retransmit toward 
the destination to provide extra observations of the source signal at the collector. Even 
though the inter partner channel is noisy, the virtual transmit-antenna array consisting 
of these partners provides additional diversity, and may entail improvements in terms of 
error rates and throughput for all the nodes involved [9], [10], [11], [12], [13], [14]. This
approach can take advantage of correlation among the different information flows simply
by including Slepian-Wolf based source coding schemes, i.e., the sensing nodes transmit
compressed versions of the sensed data to each other, so that cooperative source-channel
coding schemes can be derived [15]. However, approaches based on cooperation require
a strict coordination/synchronization among nodes, so that they can be considered as a 
single transmitter equipped with multiple antennas. This entails a more complex design 
of low level protocols and forces the nodes to fully decode signals from the other nodes. 
This operation is of course power consuming, and in some cases such an additional power 
can partially or completely eliminate the advantage of distributed diversity. 

An alternative solution to exploit correlation among users is represented by joint 
source-channel coding. In this case, no cooperation among nodes is required and the 
correlated sources are not source encoded but only channel encoded at a reduced rate 
(with respect to the uncorrelated case). The reduced reliability due to channel coding 
rate reduction can be compensated by exploiting intrinsic correlation among different
information sources at the channel decoder. Such an approach has attracted the attention
of several researchers in the recent past on account of its implementation simplicity [16],
[17], [18], [19]. Works dealing with joint source-channel coding have so far considered
classical turbo or LDPC codes, where the decoder can exploit the correlation among 
sources by performing message passing between the two decoders. However, in order to 
exploit the potentialities of such codes it is necessary to envisage very long transmitted 
sequences (often in the order of 10000 bits or even longer), a situation which is not so 
common in wireless sensor networks' applications where in general the nodes have to 
deliver a small packet of bits. Of course, the same encoding and decoding principles of 
turbo/LDPC codes can be used with shorter block lengths, but the decoder's perfor- 
mance becomes in this case similar to that of classical block or convolutional codes. 

In this paper, we will consider a joint source-channel coding scheme based on a low- 
complexity (i.e., small number of states) convolutional coding scheme. In this case, both 
the memory requirement at the encoder and the transmission delay are of very few bits 
(i.e., the constraint length of the code). Moreover, similarly to turbo or LDPC schemes, 
the complexity at the decoder can be kept low thanks to the use of an iterative joint 
decoding scheme, where the output of each decoder is fed to the other decoder's input as 
a-priori information. It is worth noting that when a convolutional code is used to provide 
forward error correction for packet data transmissions, we are in general interested in 
the average probability of block (or packet) error rather than in the bit error rate [20].
In order to manage the problem complexity, we assume that a-priori information is ideal, 
i.e., it is identical to the original information transmitted by the other encoder. In this 
case, the correlation between the a-priori information and the to-be-decoded bits is still 
equal to the original correlation between the information signals, and the problem turns 
out to be that of Viterbi decoding with a-priori soft information. 

To the best of my knowledge, the first paper which studies this problem is an early
paper by Hagenauer [21]. The bounds found by Hagenauer are generally accepted by
the research community, and a recent paper [22] uses such bounds to evaluate the
performance of a joint convolutional decoding system similar to the one proposed in this
paper. Unfortunately, the bounds found by Hagenauer are far from satisfactory,
as we will show in Section IV. In particular, [21] assumes a perfect match
between the a-priori information hard decision parameter, i.e., the sign of the a-priori
log-likelihood values, and the actually transmitted information signal. On the other
hand, in [22] the good match between simulations and theoretical curves is due to the
use of the base-10 logarithm instead of the correct natural logarithm. Hence, this paper
removes the assumptions made in [21] and provides a novel analytical framework, where the
packet error probability is evaluated by averaging over all possible configurations of
a-priori information. Such an analysis is then used for deriving optimal
coding schemes for the scenario proposed in this paper.

This paper is organized as follows. Section II describes the proposed scenario and 
gives notations used throughout the rest of the paper. In Section III, starting from 
the definition of the optimum MAP joint-decoding problem, we derive a sub-optimum 
iterative joint-decoding scheme. Sections IV and V present the analysis used to
evaluate the packet error probabilities of convolutional joint-decoding and to derive the
optimum code searching strategy. Finally, Section VI shows results and comparisons.

II. Scenario 

Let's consider the detection problem shown in Figure 1. We have two sensor nodes,
namely $SN_1$ and $SN_2$, which detect the two binary correlated signals X and Y,
respectively. Such signals, referred to as information signals in the following, are taken to be
i.i.d. correlated binary random variables with $\Pr\{x_i = 1\} = \Pr\{x_i = 0\} = \Pr\{y_i = 1\} = \Pr\{y_i = 0\} = 0.5$ and
correlation $p = \Pr\{x_i = y_i\} > 0.5$.
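The source model above can be illustrated with a short Python sketch. The function names, the seed and the flip-with-probability 1 - p construction are assumptions made for illustration only; they are one simple way to obtain uniform bits with $\Pr\{x_i = y_i\} = p$.

```python
import random

def correlated_sources(k, p, seed=0):
    """Draw k i.i.d. pairs (x_i, y_i) of uniform bits with Pr{x_i = y_i} = p."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(k)]
    # y_i copies x_i with probability p and flips it otherwise (illustrative construction).
    y = [xi if rng.random() < p else 1 - xi for xi in x]
    return x, y

def empirical_agreement(x, y):
    """Fraction of positions where the two information signals coincide."""
    return sum(xi == yi for xi, yi in zip(x, y)) / len(x)
```

For long sequences the empirical agreement should be close to p, while each signal taken alone remains uniform.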

The information signals, which are assumed to be detectable without error (i.e., ideal 
sensor nodes), must be delivered to the access point node (AP). To this aim, sensor 
nodes can establish a direct link toward the AP. We assume that the communication 
links are affected by independent link gains and by additive white Gaussian noise (AWGN). Referring to
the vectorial equivalent low-pass signal representation, we denote by s the complex
transmitted vector which conveys the information signal, by $\alpha$ the complex link gain term
which encompasses both path loss and fading, and by n the complex additive noise. As
for the channel model, we assume an almost static system characterized by very slow
fading, so that the channel link gains can be perfectly estimated at the receiver¹.

Let's assume that each transmitter uses a rate r = k/n binary antipodal channel
coding scheme to protect information from channel errors, and denote by
$x = (x_0, x_1, \ldots, x_{k-1})$ and $z = (z_0, z_1, \ldots, z_{n-1})$, with $z_i = \pm 1$, the information and the
coded sequences for $SN_1$, respectively. In an analogous manner, $y = (y_0, y_1, \ldots, y_{k-1})$



¹This assumption is reasonable since in most wireless sensor network applications sensor nodes are static or
almost static.

Fig. 1

The proposed two sensing nodes scenario 



and $w = (w_0, w_1, \ldots, w_{n-1})$, with $w_i = \pm 1$, are the information and the coded sequences
for $SN_2$.

Finally, let's denote by $E(\cdot)$ the mean operator and introduce the following terms:
$\xi_x = E(|s_x|^2/2)$ is the energy per coded sample transmitted by $SN_1$, $\xi_y = E(|s_y|^2/2)$
is the energy per coded sample transmitted by $SN_2$, $G_x = |\alpha_x|^2$ is the power gain term
for the first link, $G_y = |\alpha_y|^2$ is the power gain term for the second link, and
$E(|n_x|^2) = E(|n_y|^2) = 2N_0$ is the variance of the AWGN noise.

The coded sequence is transmitted into the channel with an antipodal binary modulation
scheme (PSK), i.e., $s_{x,i} = z_i\sqrt{2\xi_x}$ and $s_{y,i} = w_i\sqrt{2\xi_y}$. Hence, denoting by $u_{i,x}$ and
$u_{i,y}$ the decision variables at the receiver, we get:

$$\begin{aligned} u_{i,x} &= z_i\sqrt{2 G_x \xi_x} + \eta_{i,x} \\ u_{i,y} &= w_i\sqrt{2 G_y \xi_y} + \eta_{i,y} \end{aligned} \qquad (1)$$

where $\eta_{i,x}$, $\eta_{i,y}$ are Gaussian random noise terms with zero mean and variance $N_0$. The
energy per information bit for the two links can be written as $\xi_{b,x} = G_x\xi_x/r$ and $\xi_{b,y} = G_y\xi_y/r$,
respectively. Denoting by $\xi_{c,x} = r\xi_{b,x}$ and $\xi_{c,y} = r\xi_{b,y}$ the received energy per coded
bit for the two links, we can rewrite equation (1) as:



$$\begin{aligned} u_{i,x} &= z_i\sqrt{2\xi_{c,x}} + \eta_{i,x} \\ u_{i,y} &= w_i\sqrt{2\xi_{c,y}} + \eta_{i,y} \end{aligned} \qquad (2)$$

Note that the same model also holds for a more efficient quaternary modulation scheme
(QPSK), where two coded symbols are transmitted at the same time in the real and
imaginary parts of the complex transmitted sample.
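As a minimal sketch of the decision-variable model (2), the following Python function adds zero-mean Gaussian noise of variance $N_0$ to antipodal samples of amplitude $\sqrt{2\xi_c}$. The function name and its arguments are illustrative assumptions and not part of the original scheme description.

```python
import math
import random

def bpsk_awgn(z, xi_c, n0, rng):
    """Return u_i = z_i * sqrt(2 * xi_c) + eta_i, with eta_i ~ N(0, n0), as in (2).

    z    : coded sequence with entries +1 / -1
    xi_c : received energy per coded bit
    n0   : variance of each real noise term
    rng  : random.Random instance (keeps the sketch reproducible)
    """
    amplitude = math.sqrt(2.0 * xi_c)
    sigma = math.sqrt(n0)
    return [zi * amplitude + rng.gauss(0.0, sigma) for zi in z]
```

With a large ratio between `xi_c` and `n0`, the sign of each decision variable coincides with the transmitted symbol, which is a quick sanity check of the model.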

III. Iterative joint-decoding 

The decoders' problem is that of providing an estimate of x and y given the
observation sequences $u_x$ and $u_y$. Since x and y are correlated, the optimum decoding
problem can be addressed as a MAP joint decoding problem:

$$\{\hat{x}, \hat{y}\} = \arg\max_{x,y} \Pr\{x, y \mid u_x, u_y\} \qquad (3)$$

where $\hat{x}$ and $\hat{y}$ are the jointly estimated information sequences.

Despite its optimality, such a joint decoding scheme requires in general a huge
computational effort to be implemented. As a matter of fact, the number of operations
per second it requires grows as the square of that required by unjoint decoding. Such an implementation
complexity is expected in many cases to be too high, particularly when wireless sensor
network applications are of concern. In order to get a simplified receiver structure,
let's now observe that by using the Bayes rule equation (3) can be rewritten as:

$$\{\hat{x}, \hat{y}\} = \arg\max_{x,y} \Pr\{x \mid y, u_x, u_y\}\Pr\{y \mid u_x, u_y\} \qquad (4)$$

The above expression can be simplified by observing that $u_y$ is a noisy version of y and
that the noise is independent of x. Hence, (4) can be rewritten as:

$$\{\hat{x}, \hat{y}\} = \arg\max_{x,y} \Pr\{x \mid y, u_x\}\Pr\{y \mid u_x, u_y\} \qquad (5)$$

By making similar considerations as above, it is straightforward to derive from (5) the
equivalent decoding rule:

$$\{\hat{x}, \hat{y}\} = \arg\max_{x,y} \Pr\{y \mid x, u_y\}\Pr\{x \mid u_x, u_y\} \qquad (6)$$

Let's now consider the following system of equations:

$$\begin{aligned} \hat{x} &= \arg\max_{x} \Pr\{x \mid \hat{y}, u_x\}\Pr\{\hat{y} \mid u_x, u_y\} \\ \hat{y} &= \arg\max_{y} \Pr\{y \mid \hat{x}, u_y\}\Pr\{\hat{x} \mid u_x, u_y\} \end{aligned} \qquad (7)$$

It is straightforward to observe that the above system has at least one solution, that is
the optimum MAP solution given by (5) or (6).
It is also worth noting that $\Pr\{\hat{y} \mid u_x, u_y\}$ and $\Pr\{\hat{x} \mid u_x, u_y\}$ are constant terms in (7).
Therefore, the decoding problem (7) can be rewritten as:

$$\begin{aligned} \hat{x} &= \arg\max_{x} \Pr\{x \mid \hat{y}, u_x\} \\ \hat{y} &= \arg\max_{y} \Pr\{y \mid \hat{x}, u_y\} \end{aligned} \qquad (8)$$
In (8) the decoding problem has been split into two sub-problems: in each sub-problem
the decoder detects one information signal based on a-priori information given by the
other decoder. A-priori information will be referred to as side-information in the
following.

A solution of the above problem could be obtained by means of an iterative approach, 
thus noticeably reducing the implementation complexity with respect to optimum joint 
decoding. However, demonstrating whether the iterative decoding scheme converges and, if it
does, to which kind of solution it converges, is a very cumbersome problem which is outside
the scope of this paper. As in the traditional turbo decoding problem, we are instead
interested in deriving a practical method to solve (8).

To this aim, classical Soft Input Soft Output (SISO) decoding schemes, where the
decoder gets at its input a-priori information on the input bits and produces at its output a
MAP estimate of the same bits, can be straightforwardly used in this scenario. MAP
estimates and a-priori information are often expressed as log-likelihood
ratios, which can be easily converted into bit probabilities [23]. Let us denote by $P_I\{x_i\}$
and $P_I\{y_i\}$ the a-priori probabilities at the SISO decoders' inputs, and by $P_O\{x_i\}$ and
$P_O\{y_i\}$ the a-posteriori probabilities evaluated by the two decoders. In order to make the
iterative scheme work, it is necessary to convert the a-posteriori probabilities evaluated
at the j-th step into a-priori probabilities for the (j+1)-th step. According to the
correlation model between the information signals, we get:

$$\begin{aligned} P_I\{y_i\} &= P_O\{x_i\} \times p + (1 - P_O\{x_i\}) \times (1 - p) \\ P_I\{x_i\} &= P_O\{y_i\} \times p + (1 - P_O\{y_i\}) \times (1 - p) \end{aligned} \qquad (9)$$
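The conversion (9) is a one-line mapping per bit. The sketch below is only an illustration; the function name and the convention that each probability refers to the event "bit equals 1" are assumptions.

```python
def to_a_priori(p_out, p):
    """Apply (9): turn one decoder's a-posteriori Pr{bit = 1} values into
    a-priori Pr{bit = 1} values for the other decoder, given correlation p."""
    return [po * p + (1.0 - po) * (1.0 - p) for po in p_out]
```

A certain decision (probability 1 or 0) is mapped to p or 1 - p respectively, and an uninformative value of 0.5 stays at 0.5, so the side-information is never more reliable than the correlation itself.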

As for the decoding scheme, we consider the Soft Output Viterbi Algorithm (SOVA)
scheme described in [23]. Denoting by T the SOVA decoding function, the overall

Fig. 2

SOVA Iterative decoding scheme 



iterative procedure can be summarized as: 

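$P_I^{(1)}\{x_i\} = 0.5$;
for $j = 1, \ldots, N$
    $P_O^{(j)}\{x_i\} = T\left(P_I^{(j)}\{x_i\}, u_x\right)$;
    $P_I^{(j)}\{y_i\} = P_O^{(j)}\{x_i\} \times p + \left(1 - P_O^{(j)}\{x_i\}\right) \times (1 - p)$;          (10)
    $P_O^{(j)}\{y_i\} = T\left(P_I^{(j)}\{y_i\}, u_y\right)$;
    $P_I^{(j+1)}\{x_i\} = P_O^{(j)}\{y_i\} \times p + \left(1 - P_O^{(j)}\{y_i\}\right) \times (1 - p)$;
end;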

where N is the number of iterations. Figure 2 depicts the iterative SOVA joint decoding
scheme described above. We assume that the correlation factor p between the
information signals is perfectly known/estimated at the receiver. Such an assumption
is reasonable since p is expected to remain almost constant for a long time.
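The control flow of procedure (10) can be sketched in Python as follows. The SISO decoders are passed in as generic callables, and the toy `make_uncoded_siso` helper is a stand-in introduced here only to make the sketch runnable: it is a per-bit MAP rule for uncoded antipodal transmission (bit 1 mapped to +1, an assumed convention), not the SOVA decoder used in this paper.

```python
import math

def to_a_priori(p_out, p):
    """Conversion (9) between a-posteriori and a-priori Pr{bit = 1}."""
    return [po * p + (1.0 - po) * (1.0 - p) for po in p_out]

def joint_decode(siso_x, siso_y, u_x, u_y, k, p, n_iter):
    """Iterative procedure (10). siso_x / siso_y map (a-priori probs, observations)
    to a-posteriori probs and play the role of the decoding function T."""
    if n_iter < 1:
        raise ValueError("at least one iteration is required")
    p_in_x = [0.5] * k                      # no side-information at the first step
    for _ in range(n_iter):
        p_out_x = siso_x(p_in_x, u_x)
        p_in_y = to_a_priori(p_out_x, p)
        p_out_y = siso_y(p_in_y, u_y)
        p_in_x = to_a_priori(p_out_y, p)    # a-priori input for the next iteration
    return p_out_x, p_out_y

def make_uncoded_siso(amplitude, n0):
    """Hypothetical stand-in for T: bitwise MAP for uncoded antipodal samples."""
    def siso(p_in, u):
        out = []
        for prior, ui in zip(p_in, u):
            llr = 2.0 * amplitude * ui / n0 + math.log(prior / (1.0 - prior))
            llr = max(-50.0, min(50.0, llr))   # keep exp() well inside float range
            out.append(1.0 / (1.0 + math.exp(-llr)))
        return out
    return siso
```

In this toy setting a weak, wrong-signed observation on the second link can be corrected by the side-information coming from the first decoder, which is the mechanism the iterative scheme relies on.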

IV. Pairwise error probability 

We are now interested in evaluating the performance of the iterative joint-decoding
scheme. To this aim, we consider a simplified problem where the side-information
provided to the other decoder is without errors, i.e., it is equal to the original information
signal. Without loss of generality, let us focus on the first decoder:

$$\hat{x} = \arg\max_{x} \Pr\{x \mid y, u_x\} \qquad (11)$$

where y is the information signal which has been actually acquired by the second sensor.
On account of the ideal side-information assumption, y is correlated with x according
to the model $\Pr\{x_i = y_i\} = p$. To get an insight into how the ideal side-information
assumption may affect the decoder's performance, let's start by denoting by $e_{xy} = x \oplus y$
the information signals' cross-error profile, x being the information signal which has been
actually transmitted by the first transmitter. Moreover, let us denote by $e_y = y \oplus \hat{y}$
the error profile of the second decoder after decoding (8). If we make the reasonable
assumption that $e_{xy}$ and $e_y$ are independent, the actual side-information $\hat{y}$ is correlated
with x according to the model $\Pr\{x_i = \hat{y}_i\} = p' < p$, where:

$$p' = p \times (1 - P_b) + (1 - p) \times P_b \qquad (12)$$

and $P_b = \Pr\{\hat{y}_i \neq y_i\}$ is the bit error probability. It is clear from the above expression
that for small $P_b$ we get $p' \approx p$, i.e., we expect that for low bit error probability, the ideal
side-information assumption leads to an accurate performance evaluation of the iterative
decoding (8). This expectation will be confirmed by comparisons with simulation results
in Section VI.
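To give a feeling for how quickly p' approaches p, expression (12) can be evaluated numerically. The function name below is an illustrative assumption.

```python
def effective_correlation(p, p_b):
    """Evaluate (12): correlation between x and the decoded side-information
    when the second decoder has bit error probability p_b."""
    return p * (1.0 - p_b) + (1.0 - p) * p_b
```

For instance, with p = 0.9 a bit error probability of 0.001 lowers the effective correlation only to about 0.8992, whereas a bit error probability of 0.5 makes the side-information useless (p' = 0.5).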

By using the Bayes rule and by dropping the constant terms (i.e., the terms which
do not depend on x), it is now straightforward to get from (11) the equivalent decoding
rule:

$$\hat{x} = \arg\max_{x} \Pr\{u_x \mid x\}\Pr\{x \mid y\} \qquad (13)$$

Substituting for $u_x$ the expression given in (2) and considering the AWGN channel
model proposed in the previous Section, (13) can be rewritten as:

$$\hat{x} = \arg\max_{x} \left\{ \sqrt{2\xi_{c,x}} \sum_{i=0}^{n-1} u_{i,x} z_i + N_0 \ln \Pr\{x \mid y\} \right\} \qquad (14)$$

Let's now denote by $x_t$ the transmitted information signal, and by $x_e \neq x_t$ the estimated
sequence. Moreover, let's denote by $z_e \neq z_t$ the corresponding codewords and let $\gamma_{b,x} = \xi_{b,x}/N_0$.
Conditioning on y, the pairwise error probability for a given $\gamma_{b,x}$ can be defined as
the probability that the metric (14) evaluated for $z = z_e$ and $x = x_e$ is higher than that
evaluated for $z = z_t$ and $x = x_t$. Such a probability can be expressed as:

$$P_e(x_t, x_e, \gamma_{b,x} \mid y) = \Pr\left\{ \sqrt{2\xi_{c,x}} \sum_{i=0}^{n-1} u_{i,x} (z_{i,e} - z_{i,t}) - N_0 \times \ln \frac{\Pr\{x_t \mid y\}}{\Pr\{x_e \mid y\}} > 0 \right\} \qquad (15)$$

Let's now introduce the Hamming distance $d_z = D(z_t, z_e)$ between the transmitted and
the estimated codewords. Substituting for $u_x$ in (15) the expression given in (2), it is
straightforward to obtain:



$$P_e(x_t, x_e, \gamma_{b,x} \mid y) = 0.5\,\mathrm{erfc}\left\{ \sqrt{r d_z \gamma_{b,x}} + \frac{1}{4\sqrt{r d_z \gamma_{b,x}}} \ln\left( \frac{\Pr\{x_t \mid y\}}{\Pr\{x_e \mid y\}} \right) \right\} \qquad (16)$$

where $\gamma_{b,x} = \xi_{b,x}/N_0$ and erfc is the complementary error function. Notice that the term in
(16) which takes into account the side-information y is given by the natural logarithm
of a ratio of probabilities. It is straightforward to note that such a term can be positive
or negative, depending on whether the Hamming distance $D(x_t, y)$ is higher or lower than
$D(x_e, y)$. Of course, for high p, the probability that such a term becomes negative is low,
and hence one expects that on average the effect of a-priori information is positive,
i.e., it increases the argument of the erfc function or, equivalently, it reduces the pairwise
error probability. To elaborate, let's now introduce:



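$$\begin{aligned} \Gamma_{i,t} &= x_{i,t} \oplus y_i \\ \Gamma_{i,e} &= x_{i,e} \oplus y_i \end{aligned} \qquad (17)$$

where $\oplus$ is the XOR operator. Hence, it can be easily derived:

$$\frac{\Pr\{x_t \mid y\}}{\Pr\{x_e \mid y\}} = \prod_{i=0}^{k-1} p^{\Gamma_{i,e} - \Gamma_{i,t}} \times (1 - p)^{\Gamma_{i,t} - \Gamma_{i,e}} \qquad (18)$$

The above expression can be further simplified by observing that $\Gamma_{i,t} - \Gamma_{i,e}$ is different
from zero only for $x_{i,t} \oplus x_{i,e} = 1$. Hence, by introducing the set $I = \{i : x_{i,t} \oplus x_{i,e} = 1\}$,
equation (16) can be rewritten: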



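$$P_e(x_t, x_e, \gamma_{b,x} \mid y) = 0.5\,\mathrm{erfc}\left\{ \sqrt{r d_z \gamma_{b,x}} + \frac{1}{4\sqrt{r d_z \gamma_{b,x}}} \ln\left( \prod_{i \in I} p^{\Gamma_{i,e} - \Gamma_{i,t}} \times (1 - p)^{\Gamma_{i,t} - \Gamma_{i,e}} \right) \right\} \qquad (19)$$

Let's introduce the term $d_x$ as the Hamming distance between the transmitted and the
estimated information signals, i.e., $d_x = \sum_{i=0}^{k-1} x_{i,t} \oplus x_{i,e}$. Notice that $d_x$ is the dimension
of the set I and, hence, the product over I in (19) is a product of $d_x$ terms.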
The problem of evaluating the pairwise error probability in presence of a-priori soft
information has already been addressed in a previous work [21] and cited in a recent work
[22]. In [21] and [22] the a-priori information is expressed as the log-likelihood value of
the information signal and is referred to as L (e.g., see equation (5) of [22]). Notice
that, according to the notations of this paper, such a log-likelihood information can be
expressed as $L = \ln\left(\frac{p}{1-p}\right)$. Note also that the pairwise error probability given in
equation (5) of [22], through easy mathematics and in the notation of this paper, becomes
$P_d = \frac{1}{2}\mathrm{erfc}\left( \sqrt{r d_z \gamma_{b,x}} + \frac{d_x L}{4\sqrt{r d_z \gamma_{b,x}}} \right)$. Hence, in [21] and [22] the
logarithm of the product over I in (19) is set equal to the sum of the a-priori information
log-likelihood values of $x_{i,t}$, i.e., it is set equal to $\sum_{i \in I} L = d_x L$. Considering the notation
of this paper, this is equivalent to setting $\Gamma_{i,e} = 1$ and $\Gamma_{i,t} = 0$ for $i \in I$, i.e., to assuming that
there is a perfect match between the a-priori information y and the actually transmitted
information x. This assumption would lead to a heavy underestimation of the pairwise error
probability, as will be shown at the end of this Section.

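To further elaborate, notice that the terms $p^{\Gamma_{i,e} - \Gamma_{i,t}} \times (1 - p)^{\Gamma_{i,t} - \Gamma_{i,e}}$, with $i \in I$, can
take the following values:

I) $\frac{p}{1-p}$, if $x_{i,t} \oplus y_i = 0$

II) $\frac{1-p}{p}$, if $x_{i,t} \oplus y_i = 1$

Let's now define $\varepsilon_i = \overline{x_{i,t} \oplus y_i}$, the logical NOT of $x_{i,t} \oplus y_i$. Then, $P_e$ can be rewritten
as:

$$P_e(x_t, x_e, \gamma_{b,x} \mid y) = 0.5\,\mathrm{erfc}\left\{ \sqrt{r d_z \gamma_{b,x}} + \frac{1}{4\sqrt{r d_z \gamma_{b,x}}} \ln\left[ \left(\frac{p}{1-p}\right)^{\sum_{k=1}^{d_x} \varepsilon_{i(k)}} \left(\frac{1-p}{p}\right)^{d_x - \sum_{k=1}^{d_x} \varepsilon_{i(k)}} \right] \right\} \qquad (20)$$

where the indexes $i(k)$, $k = 1, \ldots, d_x$, are all the elements of the set I. Note that $P_e$ expressed
in (20) is a function of $\varepsilon_i$, $i \in I$, rather than of the whole vector y. Hence, we can write: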



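$$P_e(x_t, x_e, \gamma_{b,x} \mid \varepsilon_{i(1)}, \varepsilon_{i(2)}, \ldots, \varepsilon_{i(d_x)}) = 0.5\,\mathrm{erfc}\left\{ \sqrt{r d_z \gamma_{b,x}} + \frac{1}{4\sqrt{r d_z \gamma_{b,x}}} \ln\left[ \left(\frac{p}{1-p}\right)^{\sum_{k=1}^{d_x} \varepsilon_{i(k)}} \left(\frac{1-p}{p}\right)^{d_x - \sum_{k=1}^{d_x} \varepsilon_{i(k)}} \right] \right\} \qquad (21)$$

Notice that $\varepsilon_i$ is by definition equal to one with probability p and equal to zero with
probability 1 - p. Hence, it is possible to average out the dependence on $\varepsilon_i$ in (20), thus
obtaining an average pairwise error probability given by: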



$$P_e(x_t, x_e, \gamma_{b,x}) = \sum_{\varepsilon_{i(1)} \in \{0,1\}} \cdots \sum_{\varepsilon_{i(d_x)} \in \{0,1\}} P_e(x_t, x_e, \gamma_{b,x} \mid \varepsilon_{i(1)}, \ldots, \varepsilon_{i(d_x)}) \times p^{\sum_{k=1}^{d_x} \varepsilon_{i(k)}} (1 - p)^{d_x - \sum_{k=1}^{d_x} \varepsilon_{i(k)}} \qquad (22)$$

It is now convenient for our purposes to observe from (21) and (22) that the pairwise
error probability can be expressed as a function of solely the Hamming
distances $d_z$ and $d_x$ as:



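$$P_e(d_z, d_x, \gamma_{b,x}) = \sum_{\varepsilon_{i(1)} \in \{0,1\}} \cdots \sum_{\varepsilon_{i(d_x)} \in \{0,1\}} 0.5\,\mathrm{erfc}\left\{ \sqrt{r d_z \gamma_{b,x}} + \frac{1}{4\sqrt{r d_z \gamma_{b,x}}} \ln\left[ \left(\frac{p}{1-p}\right)^{\sum_{k=1}^{d_x} \varepsilon_{i(k)}} \left(\frac{1-p}{p}\right)^{d_x - \sum_{k=1}^{d_x} \varepsilon_{i(k)}} \right] \right\} \times p^{\sum_{k=1}^{d_x} \varepsilon_{i(k)}} (1 - p)^{d_x - \sum_{k=1}^{d_x} \varepsilon_{i(k)}} \qquad (23)$$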



Equation (23) gives rise to interesting considerations about the properties of good
channel codes. In particular, let's observe that the term $\sum_{k=1}^{d_x} \varepsilon_{i(k)}$ plays a fundamental
role in determining the pairwise error probability. Indeed, making the natural assumption
p > 0.5, if $\sum_{k=1}^{d_x} \varepsilon_{i(k)} < \lfloor d_x/2 \rfloor$ the argument of the logarithm is less than one, and,
hence, the performance is affected by a signal-to-noise-ratio reduction (the argument of
the erfc function diminishes). Note that the lower $\sum_{k=1}^{d_x} \varepsilon_{i(k)}$, the higher the performance
degradation. Hence, it is important that such bad situations occur with low probability.
On the other hand, the higher $d_x$, the lower the probability of bad events, which is
mainly given by the term $(1 - p)^{d_x - \sum_{k=1}^{d_x} \varepsilon_{i(k)}}$. Hence, it is expected that a good code
design should associate high Hamming weight information sequences with low
Hamming weight codewords. To be more specific, if we consider convolutional codes it
is expected that recursive schemes work better than non-recursive ones. This conjecture
will be confirmed in the next Sections.
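A minimal numerical sketch of (23) is given below. Since each summand in (23) depends on the $\varepsilon_{i(k)}$ only through their sum m, the $2^{d_x}$ configurations can be grouped into $d_x + 1$ binomially weighted terms; this grouping, the function name and the default rate r = 1/2 are choices made here for illustration. The signal-to-noise ratio `gamma_b` is expected in linear scale, not in dB.

```python
import math

def pairwise_error_probability(d_z, d_x, gamma_b, p, r=0.5):
    """Average pairwise error probability (23), with the sum over the 2**d_x
    configurations of eps grouped by m = number of positions where eps = 1."""
    root = math.sqrt(r * d_z * gamma_b)
    log_ratio = math.log(p / (1.0 - p))
    total = 0.0
    for m in range(d_x + 1):
        # ln[(p/(1-p))**m * ((1-p)/p)**(d_x - m)] = (2m - d_x) * ln(p/(1-p))
        argument = root + (2 * m - d_x) * log_ratio / (4.0 * root)
        weight = math.comb(d_x, m) * p**m * (1.0 - p)**(d_x - m)
        total += weight * 0.5 * math.erfc(argument)
    return total
```

Evaluating the function for a fixed $d_z$ and increasing $d_x$ (with p > 0.5) shows the trend discussed above: error events with a larger information weight benefit more from the side-information.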

To give further insight into the analysis derived so far, and to provide a comparison
with Hagenauer's bounds reported in [21] and [22], let's now consider the uncoded
case. In this simple case r = k = n = 1, $x_t = z_t$, $x_e = z_e$ (we have one-dimensional
signals), and $d_x = d_z = 1$. Moreover, the pairwise error probability becomes the
probability of decoding +1/-1 when -1/+1 has been transmitted, i.e., it is equivalent to the
bit error probability. Without loss of generality, we assume that the side-information
is y = 1, so that we can denote by $L(x) = \ln\left(\frac{p}{1-p}\right)$ the log-likelihood value of a-priori
information for the decoder. It is straightforward to get from (23):

$$P_e(\gamma_{b,x}) = 0.5\,\mathrm{erfc}\left( \sqrt{\gamma_{b,x}} + \frac{L(x)}{4\sqrt{\gamma_{b,x}}} \right) \times p + 0.5\,\mathrm{erfc}\left( \sqrt{\gamma_{b,x}} - \frac{L(x)}{4\sqrt{\gamma_{b,x}}} \right) \times (1 - p) \qquad (24)$$

By following the model proposed in [21], we would get:

$$P_e(\gamma_{b,x}) = 0.5\,\mathrm{erfc}\left( \sqrt{\gamma_{b,x}} + \frac{L(x)}{4\sqrt{\gamma_{b,x}}} \right) \qquad (25)$$

In Fig. 3 we show the $P_e$ curves as a function of p, computed according to (24) and
(25) and referred to as $C_1$ and $C_2$, respectively. Two different $\gamma_{b,x}$ values are considered:

Fig. 3

Bit error probability curves in the uncoded case 



$\gamma_{b,x} = 1$ dB and $\gamma_{b,x} = 4$ dB.

By running computer simulations we have verified that, as expected, $C_1$ represents an
exact calculation of the bit error probability (simulation curves perfectly match $C_1$).
Accordingly, it is evident that the approximation (25) is not satisfactory. On the other
hand, in [22] the good match between simulations and theoretical curves is due to the
use of the base-10 logarithm instead of the correct natural logarithm. As a matter of fact, by
using the correct calculation of L(x) one would observe the same kind of underestimation
of the bit error probability as shown in Fig. 3.
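The comparison between (24) and (25) in the uncoded case can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: $N_0$ is normalized to 1, bit 1 is mapped to +1, the Monte Carlo routine applies the MAP rule (13) bit by bit, and all function names are illustrative. The argument `gamma_b` is in linear scale (1 dB corresponds to `10 ** 0.1`).

```python
import math
import random

def pe_exact(gamma_b, p):
    """Eq. (24): average over matching and mismatching side-information."""
    root = math.sqrt(gamma_b)
    shift = math.log(p / (1.0 - p)) / (4.0 * root)
    return 0.5 * math.erfc(root + shift) * p + 0.5 * math.erfc(root - shift) * (1.0 - p)

def pe_perfect_match(gamma_b, p):
    """Eq. (25): side-information assumed to always match the transmitted bit."""
    root = math.sqrt(gamma_b)
    return 0.5 * math.erfc(root + math.log(p / (1.0 - p)) / (4.0 * root))

def simulate_uncoded(gamma_b, p, trials, seed=1):
    """Monte Carlo bit error rate of the bitwise MAP rule with side-information y."""
    rng = random.Random(seed)
    amplitude = math.sqrt(2.0 * gamma_b)     # r = 1 and N0 = 1, so xi_c = gamma_b
    prior_llr = math.log(p / (1.0 - p))
    errors = 0
    for _ in range(trials):
        x = rng.randint(0, 1)
        y = x if rng.random() < p else 1 - x
        u = (2 * x - 1) * amplitude + rng.gauss(0.0, 1.0)
        llr = 2.0 * amplitude * u + (prior_llr if y == 1 else -prior_llr)
        errors += int((llr > 0) != (x == 1))
    return errors / trials
```

At 1 dB and p = 0.9, the value returned by `pe_perfect_match` is well below that of `pe_exact`, while the simulated error rate should agree with `pe_exact` up to Monte Carlo fluctuations.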

V. Packet error probability evaluation and optimal convolutional code searching strategy

In this Section, and in the rest of the paper, we consider convolutional coding schemes
[23], [24]. Such schemes allow an easy coding implementation with very low power and
memory requirements and, hence, they seem to be particularly suitable for use in
wireless sensor networks. Let's now focus on the evaluation of the packet error probability
at the decoder in presence of perfect side-information estimation. As in traditional
convolutional coding, it is possible to derive an upper bound of the bit error probability
as the weighted² sum of the pairwise error probabilities relative to all paths which
diverge from the zero state and merge again after a certain number of transitions [23].
This is possible because of the linearity of the code and because the pairwise error
probability (23) depends only on the input and output weights $d_x$ and $d_z$, and not on the
actual transmitted sequence.

In particular, it is possible to evaluate the input-output transfer function T(W, D) by
means of the state transition relations over the modified state diagram [23]. The generic
form of T(W, D) is:

$$T(W, D) = \sum_{w,d} \beta_{w,d} W^w D^d \qquad (26)$$

where $\beta_{w,d}$ denotes the number of paths that start from the zero state and remerge
with the zero state and that are associated with an input sequence of weight w, and an
output sequence of weight d. Accordingly, we can get an upper bound of the bit error
probability of x as:

$$P_{b,x} \leq \sum_{w,d} \beta^{(x)}_{w,d} \times w \times P_e(d, w, \gamma_{b,x}) \qquad (27)$$

where $\beta^{(x)}_{w,d}$ is the $\beta_{w,d}$ term for the first encoder's code and $P_e(d, w, \gamma_{b,x})$ is the pairwise
error probability (23) for $d_z = d$ and $d_x = w$. On account of the symmetry of the
problem (7), the union bound of the bit error probability of y is:

$$P_{b,y} \leq \sum_{w,d} \beta^{(y)}_{w,d} \times w \times P_e(d, w, \gamma_{b,y}) \qquad (28)$$

where $\beta^{(y)}_{w,d}$ is the $\beta_{w,d}$ term for the second encoder's code and $\gamma_{b,y} = \xi_{b,y}/N_0$.
Following a similar procedure, it is then possible to derive the packet error probabilities.
To this aim, let's start by denoting by $L_{pkt}$ the packet data length and let's assume
that $L_{pkt}$ is much higher than the constraint lengths of the codes (the assumption is
reasonable for the low complexity convolutional codes that are considered in this paper).
In this case, since the first-error events which contribute with non-negligible terms to
the summations (27) and (28) have a length of a few times the code's constraint length,
we can assume that the number of first-error events in a packet is equal to $L_{pkt}$³. Hence,
²The weights are the information error weights.
³In other terms, we neglect the border effect.

Table I: Generator polynomials of the optimum codes

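the upper bounds $P_{d,x}$ and $P_{d,y}$ of the packet error rate can be easily derived as:

$$\begin{aligned} P_{d,x} &\leq \sum_{w,d} \beta^{(x)}_{w,d} \times L_{pkt} \times P_e(d, w, \gamma_{b,x}) \\ P_{d,y} &\leq \sum_{w,d} \beta^{(y)}_{w,d} \times L_{pkt} \times P_e(d, w, \gamma_{b,y}) \end{aligned} \qquad (29)$$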
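The bounds (27)-(29) can be evaluated numerically once a (truncated) weight spectrum $\beta_{w,d}$ is available. The sketch below is an illustration under stated assumptions: the spectrum is passed as a dictionary keyed by (w, d), the pairwise error probability (23) is computed by grouping configurations with the same number of matching side-information bits, and the example spectrum is the leading part of the textbook 4-state (7,5) non-recursive code, whose transfer function is $T(W, D) = W D^5 / (1 - 2WD)$. That code is used here only as a convenient example and is not one of the codes of Table I.

```python
import math

def pairwise_error(d_z, d_x, gamma_b, p, r):
    """Pairwise error probability (23); gamma_b is in linear scale."""
    root = math.sqrt(r * d_z * gamma_b)
    log_ratio = math.log(p / (1.0 - p))
    return sum(
        math.comb(d_x, m) * p**m * (1.0 - p)**(d_x - m)
        * 0.5 * math.erfc(root + (2 * m - d_x) * log_ratio / (4.0 * root))
        for m in range(d_x + 1)
    )

def bit_error_bound(beta, gamma_b, p, r):
    """Union bound (27)/(28): sum over (w, d) of beta * w * P_e(d, w)."""
    return sum(b * w * pairwise_error(d, w, gamma_b, p, r) for (w, d), b in beta.items())

def packet_error_bound(beta, l_pkt, gamma_b, p, r):
    """Union bound (29): sum over (w, d) of beta * L_pkt * P_e(d, w)."""
    return sum(b * l_pkt * pairwise_error(d, w, gamma_b, p, r) for (w, d), b in beta.items())

# First three terms of T(W, D) = W D^5 / (1 - 2 W D), keyed by (w, d).
# Illustrative spectrum only; truncation makes the result an approximation of the bound.
BETA_75 = {(1, 5): 1, (2, 6): 2, (3, 7): 4}
```

For example, `packet_error_bound(BETA_75, 100, 10 ** 0.3, 0.9, 0.5)` evaluates the truncated bound at 3 dB for packets of 100 bits. With p = 0.5 the expressions reduce to the usual union bound without side-information.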

Based on the procedure derived above, it is now possible to implement an exhaustive
search over all possible code structures with the aim of finding the optimum code,
intended as the code which minimizes the average packet error rate upper bound
$P_d = \frac{P_{d,x} + P_{d,y}}{2}$. We will assume in the following that sensor 1 and sensor 2 use the same code,
and that k = 1 and n = 2. In this situation, a code is uniquely determined by the
generator polynomials $G^{(1)}(D) = g^{(1)}_{\nu} D^{\nu} + g^{(1)}_{\nu-1} D^{\nu-1} + g^{(1)}_{\nu-2} D^{\nu-2} + \ldots + g^{(1)}_1 D + g^{(1)}_0$,
$G^{(2)}(D) = g^{(2)}_{\nu} D^{\nu} + g^{(2)}_{\nu-1} D^{\nu-1} + g^{(2)}_{\nu-2} D^{\nu-2} + \ldots + g^{(2)}_1 D + g^{(2)}_0$ and by the feedback
polynomial $H(D) = h_{\nu} D^{\nu} + h_{\nu-1} D^{\nu-1} + h_{\nu-2} D^{\nu-2} + \ldots + h_1 D + h_0$, where $\nu$ is the
number of shift registers of the code (i.e., the number of states is $2^{\nu}$) and $g^{(1)}_k \in \{0, 1\}$,
$g^{(2)}_k \in \{0, 1\}$, $h_k \in \{0, 1\}$. Hence, the exhaustive search is performed by considering all
possible polynomials, i.e., all $2^{3(\nu+1)}$ possible values of $G^{(1)}(D)$, $G^{(2)}(D)$, and $H(D)$. It
is worth noting that when $H(D) = 1$ the code is non-recursive while when $H(D) \neq 1$
the code becomes recursive. Table I shows the optimum code's structure obtained by
exhaustive search for $\gamma_{b,x} = \gamma_{b,y} = 3$ dB and for $\nu = 3$. Three different values of p, i.e.,
p = 0.8, p = 0.9 and p = 0.95, have been considered and three different codes, namely
$C_{80}$, $C_{90}$ and $C_{95}$, have been correspondingly obtained.
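The size of the search space can be made concrete with a short enumeration sketch. The integer encoding of the polynomials (bit i holds the coefficient of $D^i$), the function names and the test "H(D) equal to 1 means non-recursive" are assumptions made for illustration; a practical search would also discard degenerate candidates (for instance all-zero generators) and would score each remaining code with the packet error bound.

```python
from itertools import product

def candidate_codes(nu):
    """Iterate over all (G1, G2, H) triples of binary polynomials of degree <= nu.
    Each polynomial is an integer whose bit i is the coefficient of D^i."""
    n_poly = 2 ** (nu + 1)
    return product(range(n_poly), repeat=3)

def is_recursive(h):
    """Assumed convention: H(D) = 1 (integer value 1) is the non-recursive case."""
    return h != 1

def poly_to_str(coeffs, nu):
    """Render an integer-encoded polynomial, highest power first."""
    terms = []
    for i in range(nu, -1, -1):
        if (coeffs >> i) & 1:
            terms.append("1" if i == 0 else "D" if i == 1 else f"D^{i}")
    return " + ".join(terms) if terms else "0"
```

For $\nu = 3$ the iterator yields $2^{12} = 4096$ triples, which is small enough for the exhaustive search to be practical.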

As is evident from the analysis of the previous Sections, the optimum code structure depends
on the signal-to-noise ratios, i.e., different values of $\gamma_{b,x}$ and $\gamma_{b,y}$ lead to different optimum
codes. However, by running the optimum code searching algorithm for a set of different
signal-to-noise ratios, we have verified that the optimum code's structure remains the
same over a wide range of $\gamma_{b,x}$ and $\gamma_{b,y}$ and, hence, we can tentatively state that $C_{80}$,
$C_{90}$ and $C_{95}$ are the optimum codes for $\nu = 3$ and for p = 0.8, p = 0.9 and p = 0.95.

VI. Results and comparisons 

In order to test the effectiveness of the code searching strategy shown in Section V,
computer simulations of the scenario proposed in this paper have been carried out and
comparisons with the theoretical error bounds have been derived as well. In the simulated
scenario, channel decoding is based on the iterative approach described in Section
III.

The results are shown in Figs. 4-7. In particular, in Figs. 4 and 5 we set $p = 0.8$ while in
Figs. 6 and 7 we set $p = 0.9$. Besides, a packet length $L_{pkt} = 100$ is considered in Figs. 4
and 6 while a packet length $L_{pkt} = 50$ is considered in Figs. 5 and 7. In the legend, sim.
indicates simulation results and bounds indicates theoretical bounds. Different values
of $\gamma_{b,x} = \gamma_{b,y}$ have been considered in all Figs. and indicated in the abscissa as
$\gamma_b$. In the ordinate we have plotted the average packet error probability $P_d$, i.e., the
packet error probability averaged over the two sources. In these Figures we show results for
the optimum recursive codes reported in Table I, referred to as $C_r$, and for the non-recursive
code with $G^{(1)}(D) = D^3 + D^2 + 1$, $G^{(2)}(D) = D^3 + D^2 + D + 1$, which is optimum in the
uncorrelated scenario [24]. Results for the non-recursive code have been derived for both the
joint detection and the unjoint detection case, and are referred to as $C_{nr-jd}$ and
$C_{nr-ud}$, respectively†. Unjoint detection means that the intrinsic correlation among
information signals is not taken into account at the receivers and the detection depicted in
Fig. 2 is performed in only one step. In this case soft output measures are not necessary and,
hence, we use a simple Viterbi decoder with hard output.
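
For reference, a feedforward encoder for the non-recursive rate-1/2 code can be sketched as below. This is only an illustrative sketch: it assumes the generators are $D^3 + D^2 + 1$ and $D^3 + D^2 + D + 1$, that the coefficient of $D^k$ multiplies the input delayed by $k$ steps, and that the encoder starts from the all-zero state. The function name `conv_encode` is hypothetical, and trellis termination is not handled.

```python
def conv_encode(bits, generators):
    """Feedforward (non-recursive) convolutional encoder.

    generators: list of coefficient tuples (g_0, ..., g_nu), where g_k
    multiplies D^k, i.e. the input delayed by k steps (assumed convention).
    Returns one list of coded bits per generator.
    """
    nu = len(generators[0]) - 1
    state = [0] * nu  # state[k-1] holds the input from k steps ago
    outputs = [[] for _ in generators]
    for u in bits:
        window = [u] + state  # window[k] = input delayed by k steps
        for out, g in zip(outputs, generators):
            out.append(sum(gk & wk for gk, wk in zip(g, window)) % 2)
        state = window[:nu]  # shift the register by one position
    return outputs


G1 = (1, 0, 1, 1)  # D^3 + D^2 + 1
G2 = (1, 1, 1, 1)  # D^3 + D^2 + D + 1

# The impulse response of each branch reproduces its generator coefficients.
impulse = [1, 0, 0, 0]
c1, c2 = conv_encode(impulse, [G1, G2])
```

Feeding a single 1 followed by zeros returns the generator coefficients themselves, which is a quick sanity check on the tap ordering.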

Notice that, according to the analysis discussed in the previous Sections, the theoretical
error bounds are expected to be upper bounds on the packet error probability (e.g., union
bound probabilities). As a matter of fact, the theoretical bounds do upper-bound the packet
error probability at low packet error rates, when the assumption $p' = p$ is reasonable
(see (13)). Instead, at high packet error rates, i.e., for low $\gamma_b$, the theoretical bounds
tend in some cases to overlap the simulation curves. This is because at high bit error rates,
i.e., at high packet error rates, the side information is affected by non-negligible errors
and the hypothesis of perfect side information made in the analysis is no longer valid.
However, the theoretical bounds represent in all cases a good approximation of the simulation
results.

† We do not use the same notation for the optimum recursive code $C_r$ since in this case we
only perform joint detection. On the other hand, the unjoint detection case is equivalent to
the uncorrelated case, where $C_{nr}$ is the optimum code.

By observing again Figs. 4-7 the following conclusions can be drawn. The optimum recursive
codes yield an actual performance gain with respect to the non-recursive scheme, thus
confirming the validity of the theoretical analysis described in the previous Sections. Such
a performance gain is particularly evident for high $p$ values, e.g., the performance gain at
$P_d = 0.01$ is nearly 0.6 dB for $p = 0.9$ while for $p = 0.8$ the gain is less than 0.3 dB.
Comparisons with the unjoint detection case show that, as expected, joint detection yields a
noticeable performance gain with respect to the unjoint case (from 0.6 dB for $p = 0.8$ to
more than 1.3 dB for $p = 0.9$).

In order to assess the validity of the joint source-channel coding approach considered in
this paper, let us now provide a comparison with a transmitting scheme which performs
distributed source coding achieving the Slepian-Wolf compression limit, and independent
convolutional channel coding. Note that such a scheme is ideal, since the Slepian-Wolf
compression limit cannot be achieved with practical source coding schemes. For comparison
purposes, we focus on the $p = 0.9393$ case and we start by observing that the ideal
compression limit is equal to the joint entropy of the two information signals,
$H(x, y) = H(x) + H(x|y) = 1 - p \times \log_2(p) - (1 - p) \times \log_2(1 - p) = 1.33$. In order
to get a fair comparison, let us now assume that the transmitter with ideal Slepian-Wolf
compressor, referred to as SW in the following, has at its disposal the same total energy and
the same transmitting time as the joint source-channel coding transmitter without source
compression proposed in this paper, referred to as JS-CC in the following. This means that
the SW transmitters can use the same energies $\xi_x$ and $\xi_y$ as the JS-CC transmitters and a
reduced channel coding rate $r_{sw} = \frac{H(x, y)}{2} \times r = \frac{2}{3} r$, $r$ being the
channel coding rate for JS-CC. To be more specific, considering again $r = 1/2$ for the JS-CC
case, the SW transmitting scheme can be modeled as two independent transmitters which have to
deliver $L_{pkt,sw} = \frac{2}{3} L_{pkt}$ independent information bits each‡, using a channel
rate $r_{sw} = 1/3$ and transmitting energies $\xi_x$ and $\xi_y$. As for the JS-CC transmitting
scheme, we consider both the recursive $C_{95}$ channel coding scheme shown in Table I and the
$r = 1/2$ non-recursive coding scheme described above. As before, the two cases are referred
to as $C_r$ and $C_{nr-jd}$, respectively. Note that in both cases we perform the iterative
joint decoding scheme described in the previous Section in an attempt to exploit the
correlation between information signals. Instead, since distributed compression fully
eliminates the correlation between information signals, in the SW case unjoint detection with
hard Viterbi decoding is performed at the receiver. As for the channel coding scheme, we
consider in the SW case a non-recursive rate-1/3 convolutional code with $\nu = 3$ and with
generator polynomials $G^{(1)}(D) = D^3 + D + 1$, $G^{(2)}(D) = D^3 + D^2 + 1$,
$G^{(3)}(D) = D^3 + D^2 + D + 1$.

‡ Since the SW scheme performs ideal distributed compression, the original correlation between
information signals is fully lost.
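
The arithmetic behind the SW reference scheme can be checked with a few lines of code. This is a minimal sketch under the assumptions stated in the text (uniform binary sources, so that each marginal entropy is 1 bit, and a conditional entropy given by the binary entropy of $p$). The variable names are hypothetical, and the packet length of 100 bits is the value used in the comparisons.

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a binary variable with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


p = 0.9393   # correlation parameter considered in the comparison
r = 0.5      # channel coding rate of the JS-CC scheme
L_pkt = 100  # information bits per packet for JS-CC

# Joint entropy of the two sources: 1 bit for the first source plus the
# conditional entropy of the second one.
joint_entropy = 1.0 + binary_entropy(p)

# Ideal compression shrinks each packet by joint_entropy / 2, so the same
# number of coded samples allows a lower channel coding rate.
compression = joint_entropy / 2.0
r_sw = compression * r
L_pkt_sw = compression * L_pkt
```

With these values the joint entropy is close to 4/3 bits, so the compression factor is close to 2/3 and the resulting channel rate is close to 1/3, in agreement with the figures quoted above.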

In order to provide an extensive set of comparisons between $C_r$, $C_{nr-jd}$ and SW we
consider a more general channel model than the AWGN considered so far. In particular,
we assume that the link gains $\alpha_x$ and $\alpha_y$ are Rice distributed [24] with Rice factor
$K_R$ equal to 0 (i.e., Rayleigh case), 10, and $\infty$ (i.e., AWGN case). The AWGN, Rayleigh
and Rice ($K_R = 10$) cases are shown in Figs. 8, 9 and 10, respectively. In all cases we
consider a packet length $L_{pkt} = 100$. Moreover, we assume that the two transmitters use the
same transmitting energy per coded sample $\xi = \xi_x = \xi_y$. In the abscissa we show the average
received power $E(\xi_{rx}) = E(|\alpha_x|^2) \times \xi = E(|\alpha_y|^2) \times \xi$ expressed in dB. Note
that the average $\gamma_b$ terms can be straightforwardly derived as $E(\gamma_b) = E(\xi_{rx})$ for
the $C_r$ and $C_{nr-jd}$ cases, and $E(\gamma_b) = 1.5 \times E(\xi_{rx})$ for the SW case, the factor
1.5 being the ratio $r / r_{sw}$ between the two channel coding rates. It is worth noting
that the comparisons shown in Figs. 8, 9 and 10 are fair in that $C_r$, $C_{nr-jd}$ and SW use
the same global energy to transmit the same amount of information bits in the same
delivering time.
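
The three channel conditions can be reproduced in simulation with a standard Rice fading generator. The sketch below is an assumption-laden illustration rather than the simulator used in the paper: it normalises the link gain to unit mean power, treats an infinite Rice factor as a fixed unit gain, and uses hypothetical names (`rice_gain`, `mean_power`). It also evaluates the 1.5 factor between the two $\gamma_b$ definitions in dB.

```python
import math
import random


def rice_gain(k_factor, rng):
    """Draw one complex link gain with unit mean power (assumed normalisation).

    k_factor = 0 gives Rayleigh fading; k_factor = inf gives a fixed gain,
    i.e. the AWGN channel.
    """
    if math.isinf(k_factor):
        return complex(1.0, 0.0)
    los = math.sqrt(k_factor / (k_factor + 1.0))           # line-of-sight part
    sigma = math.sqrt(1.0 / (2.0 * (k_factor + 1.0)))      # per-component std
    return complex(los + sigma * rng.gauss(0.0, 1.0),
                   sigma * rng.gauss(0.0, 1.0))


def mean_power(k_factor, n_samples=20000, seed=1):
    """Monte Carlo estimate of E(|alpha|^2) for a given Rice factor."""
    rng = random.Random(seed)
    total = sum(abs(rice_gain(k_factor, rng)) ** 2 for _ in range(n_samples))
    return total / n_samples


def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)


# Offset between the SW and JS-CC values of E(gamma_b) for the same
# average received power, from the 1.5 factor quoted in the text.
gamma_offset_db = to_db(1.5)
```

Under this normalisation the average received power equals the transmitted energy per coded sample for every Rice factor, so curves for different channels can share the same abscissa.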

Notice from Fig. 8 that in the AWGN case SW works better than the other two schemes,
even if the optimum recursive scheme $C_r$ reduces the gap from more than 1 dB to a fraction
of a dB. The most interesting and, dare we say, surprising results are shown in Figs. 9 and
10, where the $C_r$ decoding scheme clearly outperforms SW with a gain of more than 1 dB in the
Rayleigh case and of almost 1 dB in the Rice case, while $C_{nr-jd}$ and SW perform almost the
same. This result confirms that, in the presence of many-to-one transmissions, separation
between source and channel coding is not optimum. The main reason for this result is that, in
the presence of an unbalanced signal quality from the two transmitters (e.g., independent
fading), leaving a correlation between the two information signals can be helpful, since the
better-quality received signal can be used as side information for detecting the other signal.
In other words, the proposed joint decoding scheme yields a diversity gain which is not
obtainable by the SW scheme. Such a diversity gain is due to the inherent correlation between
information signals and, hence, can be exploited at the receiver without implementing any
kind of cooperation between the transmitters.
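
One way to see how a well-received signal can help the other one is to look at the a-priori information that one decoder could pass to the other. The sketch below is not taken from the paper: it uses the textbook log-likelihood combining rule for the XOR of two independent bits, assumes that corresponding bits of the two sources agree with probability $p$, and uses hypothetical function names.

```python
import math


def correlation_llr(p):
    """LLR of the event 'the two source bits agree' (probability p)."""
    return math.log(p / (1.0 - p))


def boxplus(l1, l2):
    """LLR of the XOR of two independent bits with LLRs l1 and l2."""
    t = math.tanh(l1 / 2.0) * math.tanh(l2 / 2.0)
    # Clamp to keep atanh finite when both inputs are very reliable.
    t = max(min(t, 1.0 - 1e-12), -1.0 + 1e-12)
    return 2.0 * math.atanh(t)


def apriori_from_side_info(llr_other, p):
    """A-priori LLR for one source bit, given the other decoder's soft output.

    Assumption: x = y XOR z with z independent of y and P(z = 0) = p.
    """
    return boxplus(llr_other, correlation_llr(p))


# Reliable side information (strong link) versus no side information.
strong = apriori_from_side_info(50.0, 0.9)
none = apriori_from_side_info(0.0, 0.9)
```

When the other link is reliable the a-priori term saturates at $\ln(p / (1 - p))$, and when it carries no information the term vanishes, which is consistent with the intuition that the better link acts as side information for the weaker one.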

VII. Conclusions 

A simple wireless sensor network scenario, where two nodes detect correlated sources
and deliver them to a central collector via a wireless link, has been considered. In this
scenario, a joint source-channel coding scheme based on low-complexity convolutional
codes has been presented. Similarly to turbo or LDPC schemes, the complexity at the
decoder has been kept low thanks to the use of an iterative joint decoding scheme,
where the output of each decoder is fed to the other decoder's input as a-priori
information. For the proposed convolutional coding/decoding scheme we have derived
a novel analytical framework for evaluating an upper bound on the joint-detection packet
error probability and for deriving the optimum coding scheme, i.e., the code which
minimizes the packet error probability. Comparisons with simulation results show that the
proposed analytical framework is effective. In particular, in the AWGN case the optimum
recursive coding scheme derived from the analysis clearly outperforms classical
non-recursive schemes. As for the fading scenario, the proposed transmitting scheme yields
a diversity gain which is not obtainable by the classical Slepian-Wolf approach to
distributed source coding of correlated sources. Such a diversity gain allows the proposed
scheme to clearly outperform a Slepian-Wolf scheme based on ideal compression of
distributed sources.

Simulation results and theoretical bounds for $p = 0.8$ and $L_{pkt} = 100$

Simulation results and theoretical bounds for $p = 0.9$ and $L_{pkt} = 50$



[Plot: simulated curves for $C_{nr-jd}$, $C_r$ and SW on a logarithmic ordinate; abscissa $E(\xi_{rx})$ (dB), from 1 to 5 dB.]

Fig. 8 

Comparison with the SW case: AWGN channel 

Comparison with the SW case: Rayleigh channel model

Comparison with the SW case: Rice channel model with $K_R = 10$